Abstract

This work considers recovery of signals that are sparse over two bases. For instance, a signal might be sparse in both time and frequency, or a matrix can be low rank and sparse simultaneously. To facilitate recovery, we consider minimizing the sum of the $\ell_1$-norms that correspond to each basis, which is a tractable convex approach. We find novel optimality conditions which indicate a gain over traditional approaches where $\ell_1$ minimization is done over only one basis. Next, we analyze these optimality conditions for the particular case of time-frequency bases. Denoting sparsity in the first and second bases by $k_1, k_2$ respectively, we show that, for a general class of signals, using this approach, one requires as few as $O(\max\{k_1, k_2\}\log\log n)$ measurements for successful recovery, hence overcoming the classical requirement of $\Theta(\min\{k_1, k_2\}\log(\frac{n}{\min\{k_1, k_2\}}))$ for $\ell_1$ minimization when $k_1 \approx k_2$. Extensive simulations show that our analysis is approximately tight.

Index Terms — basis pursuit, compressed sensing, phase re- 
trieval, duality, convex optimization 

I. Introduction 

Compressed sensing is concerned with the recovery of sparse vectors and has recently been the subject of immense interest. One of the main methods is Basis Pursuit (BP), where the $\ell_1$ norm is minimized subject to convex constraints. Assuming $x$ has a sparse representation over the basis $U$ (i.e., $Ux$ is a sparse vector) and assuming we get to see the observations $Ax$, Basis Pursuit performs the following optimization to recover $x$:

$$\min_{\hat{x}} \|U\hat{x}\|_1 \quad \text{subject to} \quad A\hat{x} = Ax \tag{BP}$$

In this work, we investigate recovery of vectors that can be sparsely represented over two bases. For example, a vector such as a Dirac comb can be sparse in time and frequency. Similarly, we can consider a low rank matrix which is supported over an unknown submatrix and zero elsewhere, and hence sparse. Assuming $x$ is sparse over $U_1, U_2$, in order to induce sparsity in both bases, we consider the following approach, which we call Joint Basis Pursuit (JBP):

$$\min_{\hat{x}} \|U_1\hat{x}\|_1 + \lambda\|U_2\hat{x}\|_1 \quad \text{s.t.} \quad A\hat{x} = Ax \tag{JBP}$$

This work was supported in part by the National Science Foundation under grants CCF-0729203, CNS-0932428 and CCF-1018927, by the Office of Naval Research under the MURI grant N00014-08-1-0747, and by Caltech's Lee Center for Advanced Networking.

For the case of a matrix $X$ that is simultaneously sparse and low rank, we may minimize the sum of the $\ell_1$ norm and the matrix nuclear norm, which is denoted by $\|\cdot\|_*$ and is equal to the sum of the singular values. Assuming we observe linear measurements $\mathcal{A}(X)$, we propose solving the following problem (JBP-Matrix) to recover $X$:

$$\min_{\hat{X}} \|\hat{X}\|_* + \lambda\|\hat{X}\|_1 \quad \text{s.t.} \quad \mathcal{A}(\hat{X}) = \mathcal{A}(X) \tag{JBPM}$$

While other related problems can be formulated, this paper will focus on JBP and JBPM. Our motivations are:

• Investigating whether JBP can outperform regular BP.

• The sparse phase retrieval problem, in which $x$ is a sparse vector and one observes $|\langle a_i, x\rangle|^2$ as measurements. While it is not possible to cast this as a regular compressed sensing problem, it can be cast as JBPM, where we wish to recover the sparse and low rank matrix $xx^*$. This problem is known to have applications to X-ray crystallography [6] and has recently attracted interest.

Background: It should be emphasized that, recently, there has been significant interest in using a combination of different norms to exploit the structure of a signal. While this paper deals with signals having sparse representations in both bases, [3]-[5] consider the problem of separating signals that are combinations of sparsely representable incoherent pieces.

Contributions: In this work we provide sharp recovery conditions that guarantee success of JBP and JBPM. Next, we cast these conditions in a dual certificate framework to facilitate analysis. For the case of time-frequency bases, we analyze the dual certificate construction to find that, for the class of "periodic signals", one needs at most $O(\max\{k_1, k_2\}\log\log n)$ measurements, where $k_1, k_2$ represent the sparsity in $U_1, U_2$. This shows that JBP can indeed outperform regular BP, which requires $\Omega(k\log\frac{n}{k})$ measurements for recovery of a $k$-sparse vector [12], [13]. Finally, simulation results indicate that our results are sharp. We believe that the result of this paper can be seen as negative in nature. While JBP provides an improvement, it is not a significant improvement when we consider the fact that signals that are simultaneously sparse are few in number.

II. Problem Setup 

We begin by considering the (JBP) problem and assume $x \in \mathbb{C}^n$ is a signal that is sparse over two complete bases, $U_1, U_2$. Later on we will briefly extend our approach to (JBPM) and the recovery of matrices that are simultaneously sparse and low rank.

The basic question we would like to answer is whether 
one can do better in recovering x from measurements Ax 
by exploiting the joint sparsity of x. 

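Before going into technical details, we introduce the relevant notation. Denote the set $\{1, 2, \dots, n\}$ by $[n]$. Let $S_1, S_2 \subseteq [n]$ denote the supports of $x$ in the bases $U_1$ and $U_2$, i.e., the locations of the nonzero entries of $U_1x$ and $U_2x$ respectively. Further, let $S_1(\cdot) : \mathbb{C}^n \to \mathbb{C}^{|S_1|}$, $S_2(\cdot) : \mathbb{C}^n \to \mathbb{C}^{|S_2|}$ denote the operators that collapse a vector onto $S_1, S_2$ respectively; $\bar{S}_i = [n] \setminus S_i$ denotes the complement of $S_i$ and $\bar{S}_i(\cdot)$ the operator that collapses a vector onto $\bar{S}_i$. $\mathrm{sgn}(\cdot) : \mathbb{C}^n \to \mathbb{C}^n$ is the function that returns the entrywise signs of a vector, i.e., $0$ is mapped to $0$ and $a \neq 0$ is mapped to $\frac{a}{|a|}$. $I$ will be the identity matrix of the appropriate size. The null space of a linear operator $A$ is denoted by $\mathcal{N}(A)$. $\mathcal{R}(\cdot), \mathcal{I}(\cdot) : \mathbb{C}^n \to \mathbb{R}^n$ are the functions that return the entrywise real and imaginary parts of a vector. Denote $\sqrt{-1}$ by $\mathrm{i}$. $D$ is the Discrete Fourier Transform (DFT) matrix of the appropriate size and is given as follows,

$$D_{i,j} = \frac{1}{\sqrt{n}}W^{(i-1)(j-1)}, \quad 1 \le i, j \le n \tag{1}$$

where $W$ is always $\exp(-\frac{2\pi\mathrm{i}}{n})$. We will use $\lambda_1, \lambda_2$ and $1, \lambda$ interchangeably.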
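As a quick sanity check of the DFT convention, the following minimal sketch builds the matrix entrywise and measures how far $D^*D$ is from the identity. It assumes the unitary $1/\sqrt{n}$ normalization and uses 0-based indices (so the exponent $(i-1)(j-1)$ becomes `i * j`); the function names `dft_matrix` and `max_unitarity_error` are illustrative and not from the paper.

```python
import cmath
import math


def dft_matrix(n):
    """Unitary DFT matrix with entries W**(i*j) / sqrt(n), W = exp(-2*pi*1j/n), 0-based i, j."""
    w = cmath.exp(-2j * math.pi / n)
    scale = 1.0 / math.sqrt(n)
    return [[scale * w ** (i * j) for j in range(n)] for i in range(n)]


def max_unitarity_error(d):
    """Largest entrywise deviation of D^* D from the identity matrix."""
    n = len(d)
    err = 0.0
    for i in range(n):
        for j in range(n):
            entry = sum(d[t][i].conjugate() * d[t][j] for t in range(n))
            err = max(err, abs(entry - (1.0 if i == j else 0.0)))
    return err


error = max_unitarity_error(dft_matrix(8))
```

Under this normalization the error should be at the level of floating point round-off, which is the property the later arguments rely on when they treat $I$ and $D$ as unitary.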

Remark: Proofs that are omitted can be found in the 
appendix. 

A. Recovery Conditions for JBP 

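We will start by explaining our approach. Let $A \in \mathbb{C}^{m\times n}$ be the measurement matrix, where $m$ is the number of measurements. The following lemma gives a condition that guarantees $x$ to be the unique optimum of (JBP).

Lemma II.1 (Null Space Condition). Assume, for all nonzero $w \in \mathcal{N}(A)$, the following holds,

$$\sum_{i=1}^{2}\lambda_i\left(\mathcal{R}(\langle \mathrm{sgn}(U_ix), U_iw\rangle) + \|\bar{S}_i(U_iw)\|_1\right) > 0 \tag{2}$$

where $\bar{S}_i(\cdot)$ collapses a vector onto the complement of $S_i$. Then, $x$ is the unique optimizer of (JBP).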



Proof: Let $f(\hat{x})$ be the cost of (JBP), i.e., $f(\hat{x}) = \sum_{i=1}^{2}\lambda_i\|U_i\hat{x}\|_1$. Then, for any $w \in \mathcal{N}(A)$, $f(x + w) - f(x)$ is lower bounded by the left hand side of (2), which follows from the subgradient of the $\ell_1$ norm. Hence $f(\hat{x}) > f(x)$ for all $\hat{x} \neq x$ with $A\hat{x} = Ax$. ■

Based on (2), the following lemma connects success of (JBP) to the existence of dual certificates.

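Lemma II.2. Assume $s_1, s_2 \in \mathbb{C}^m$, $s \in \mathbb{C}^n$ satisfying the following conditions exist:

• $S_1(U_1^{-*}(A^*s_1 + s)) = S_1(\mathrm{sgn}(U_1x))$

• $\|\bar{S}_1(U_1^{-*}(A^*s_1 + s))\|_\infty < 1$

• $S_2(U_2^{-*}(A^*s_2 - s)) = \lambda S_2(\mathrm{sgn}(U_2x))$

• $\|\bar{S}_2(U_2^{-*}(A^*s_2 - s))\|_\infty < \lambda$

• $A$ is invertible over $\bigcap_{i=1}^{2}\{v \mid S_i(U_iv) = U_iv\}$.

Then $x$ is the unique optimum of (JBP).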

Proof: What we need to show is that if such $s_1, s_2, s$ exist and the invertibility assumption holds, then the left hand side of (2) is strictly positive for all nonzero $w \in \mathcal{N}(A)$. Assume such $s_1, s_2, s$ exist and let $v_1, v_2 \in \mathbb{C}^n$ be:

$$v_1 = U_1^{-*}(A^*s_1 + s) \quad \text{and} \quad v_2 = U_2^{-*}(A^*s_2 - s) \tag{3}$$

Observe that for any $w \in \mathcal{N}(A)$, using $Aw = 0$,

$$\sum_{i=1}^{2}\langle U_iw, v_i\rangle = \sum_{i=1}^{2}w^*A^*s_i + w^*s - w^*s = 0 \tag{4}$$

To end the proof, observe that $v_1, v_2$ satisfy the conditions listed in Lemma II.2, which implies that the left hand side of (2) is strictly positive when combined with (4). This follows from the fact that either $\bar{S}_1(U_1w)$ or $\bar{S}_2(U_2w)$ is nonzero due to the invertibility assumption. ■
The dual certificate approach for regular BP has been used in prior work, e.g., [5]. Letting $U = U_1$, compared to Lemma II.2, it requires invertibility of $A$ over $\{v \mid S_1(U_1v) = U_1v\}$ rather than the intersection, and it requires $\|\bar{S}_1(U_1^{-*}A^*s_1)\|_\infty < 1$, while Lemma II.2 can overcome this by making use of the extra variable $s$. From this perspective, JBP can be viewed as a combination of two regular BP's that are allowed to "help" each other via $s$.

III. Main Results 

Our main result is concerned with the time-frequency bases, i.e., the identity and DFT matrices. Before stating the main result, let us first describe the setting for which it holds.

Definition III.1. $S$ is an $l$-periodic subset of $[n]$ if $n$ is divisible by $l$ and for any $i \in [n]$, we have,

$$i \in S \iff j \in S \ \text{ for all } j \text{ such that } j \equiv i \pmod{l} \tag{5}$$

Observe that if $S$ is an $l$-periodic support, $|S|$ is divisible by $n/l$.
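The definition can be made concrete with a small check. The sketch below is illustrative only: it uses 0-based indices (a shift that does not affect periodicity), and the helper name `is_periodic_support` is an assumption of this example rather than notation from the paper.

```python
def is_periodic_support(support, n, l):
    """True if `support` (a subset of range(n)) is l-periodic in the sense of Definition III.1."""
    if n % l != 0:
        return False
    s = set(support)
    residues = {i % l for i in s}
    # An index belongs to the support exactly when its residue class mod l does.
    return all((j % l in residues) == (j in s) for j in range(n))


n, l = 12, 3
support = [1, 4, 7, 10, 2, 5, 8, 11]  # residue classes 1 and 2 modulo 3
periodic = is_periodic_support(support, n, l)
size_divisible = len(support) % (n // l) == 0
```

Each residue class modulo $l$ contributes exactly $n/l$ indices, which is why the size of an $l$-periodic support is a multiple of $n/l$.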



Theorem III.1. Let $U_1 = I$, $U_2 = D$, let $1 > \alpha > 0$ be an arbitrary constant and, without loss of generality, assume $|S_1| \le |S_2|$. Further, assume the following hold:

• $|S_1| \le \frac{n}{\log n}$.

• $S_1, S_2$ are $n_1, n_2$ periodic supports, where $n = n_1n_2$.

• $|S_2| \le |S_1|\log^{\alpha}(n)$.

Then, for the following scenarios, $x$ can be successfully recovered via JBP with high probability (for sufficiently large $n$) when the matrix $A \in \mathbb{C}^{m\times n}$ is generated with i.i.d. complex Gaussian entries.

• If $|S_2| \le |S_1|\log\log n$: setting $\lambda = 1$ and using $m = O(|S_2|\log\log n)$ measurements.

• If $|S_2| > |S_1|\log\log n$: setting $\lambda = \log^{-1}(n)$ and using $m = O(|S_2|)$ measurements.

Remark: Our proof approach will inherently require $m \ge \max\{|S_1|, |S_2|\}$. Consequently, if $|S_2| \ge |S_1|\log(n)$, then one can already perform the regular $\ell_1$ optimization over $U_1 = I$ to ensure recovery with $m = O(|S_2|)$ measurements. Hence, $|S_2| \le |S_1|\log^{\alpha}(n)$ is a reasonable assumption.

A. Signals with Periodic Supports 

Theorem III.1 holds for signals whose supports are periodic with $n_1, n_2$ over $I$ and $D$ respectively, where $n = n_1n_2$. Here, we give a family of signals that satisfy this requirement. Let $T$ be the set of signals $v \in \mathbb{C}^n$ such that for some $1 \le l \le n_1$ and $0 \le t < n$,

$$v_j = \begin{cases} 0 & \text{if } j \not\equiv l \pmod{n_1} \\ W^{tj} & \text{else} \end{cases} \tag{6}$$

Basically, $T$ is the set of Dirac combs with period $n_1$, and hence for any $v \in T$, $Dv$ will have $\frac{n}{n_1}$ periodic support. In general, almost all $x$ of the form

$$x = \sum_{v_i \in T} a_iv_i \tag{7}$$

will have $n_1$ periodic support and $Dx$ will have $\frac{n}{n_1}$ periodic support. The reason we say almost all is that cancellations may occur when the $v_i$'s are added. However, if the $a_i$'s are chosen from a continuous distribution, the chance of cancellation is 0.
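To illustrate the claim about the Fourier support of such combs, the sketch below builds one modulated comb and lists the supports of the signal and of its DFT. It is a small numerical illustration under assumptions: indices are 0-based, the comb entries are taken to be $W^{tj}$ on one residue class modulo $n_1$, and the helper names are hypothetical.

```python
import cmath
import math


def comb_signal(n, n1, l, t):
    """Entries W**(t*j) on indices j with j % n1 == l, zero elsewhere (0-based, W = exp(-2*pi*1j/n))."""
    w = cmath.exp(-2j * math.pi / n)
    return [w ** (t * j) if j % n1 == l else 0j for j in range(n)]


def dft(v):
    """Unitary DFT of a list of complex numbers, computed directly from the definition."""
    n = len(v)
    w = cmath.exp(-2j * math.pi / n)
    return [sum(w ** (k * j) * v[j] for j in range(n)) / math.sqrt(n) for k in range(n)]


def support(v, tol=1e-9):
    """Indices whose entries exceed `tol` in magnitude."""
    return [i for i, a in enumerate(v) if abs(a) > tol]


n, n1 = 12, 3            # so n2 = n / n1 = 4
v = comb_signal(n, n1, l=1, t=5)
time_support = support(v)
freq_support = support(dft(v))
```

In this instance the time support is one residue class modulo $n_1 = 3$ and the frequency support is one residue class modulo $n_2 = 4$, so the signal has $n_2$ nonzeros in time and $n_1$ in frequency.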
B. Converse Results 

We should emphasize that the main reason we have considered the $I, D$ pair is the fact that almost all bases $U_1$ and $U_2$ do not permit signals that are sparse in both. The following lemma illustrates this.

Lemma III.1. Assume $U_1^{-1}, U_2^{-1}$ have i.i.d. entries chosen from a continuous distribution. Then, with probability 1, there exists no nonzero vector $x$ satisfying $|S_1| + |S_2| \le n$.

An interesting work by Tao shows that such results are true even for highly structured bases. In particular, if $n$ is a prime number, we still have the requirement $|S_1| + |S_2| > n$ for a signal over the bases $U_1 = I$ and $U_2 = D$.



IV. Proof of Theorem III.1

This section will be dedicated to the analysis of Lemma II.2 to prove Theorem III.1. We start by proposing a construction for $s_1, s_2, s$ that certifies optimality of $x$.

A. Construction of $s_1, s_2, s$

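For the following discussion, we use $(U_1, U_2)$ and $(I, D)$, as well as $(1, \lambda)$ and $(\lambda_1, \lambda_2)$, interchangeably. The construction of $s_1, s_2$ follows a classical approach previously used in [2], [5], [7]. Letting $A_{S_1} \in \mathbb{C}^{m\times|S_1|}$ denote the submatrix obtained by choosing the columns corresponding to $S_1$ and $B = AD^*$, we will use the following $s_1, s_2$.

$$s_1 = A_{S_1}(A_{S_1}^*A_{S_1})^{-1}S_1(\mathrm{sgn}(x)) \tag{8}$$
$$s_2 = B_{S_2}(B_{S_2}^*B_{S_2})^{-1}\lambda S_2(\mathrm{sgn}(Dx)) \tag{9}$$

Since $I, D$ are unitary we have $U_i^{-*} = U_i$. By construction $s_1, s_2$ already satisfy,

$$S_i(U_iA^*s_i) = \lambda_iS_i(\mathrm{sgn}(U_ix)), \quad i \in \{1, 2\} \tag{10}$$

However, one has to control the term $\|\bar{S}_i(U_iA^*s_i)\|_\infty$ and we will make use of $s$ to achieve this. Denote $U_iA^*s_i$ by $y_i$. Define the vectors $\{b_1, b_2\}$ as follows:

$$\mathcal{R}(b_{i,j}) = \begin{cases} 0 & \text{if } j \in S_i \\ 0 & \text{if } j \in \bar{S}_i \text{ and } |\mathcal{R}(y_{i,j})| \le \lambda_i/4 \\ \mathcal{R}(y_{i,j}) - \lambda_i\mathrm{sgn}(\mathcal{R}(y_{i,j}))/4 & \text{else} \end{cases} \tag{11}$$

and the imaginary part $\mathcal{I}(b_{i,j})$ is obtained from $\mathcal{I}(y_{i,j})$ in the same way. Observe that $\|\bar{S}_i(y_i - b_i)\|_\infty \le \lambda_i/2$. Based on $\{b_i\}_{i=1}^{2}$, construct $s$ as follows,

$$s = D^*(b_2 - c_2) - I(b_1 - c_1) \quad \text{where} \tag{12}$$
$$c_1 = D^*I_{S_2}Db_1, \quad c_2 = DI_{S_1}D^*b_2$$

Here, $I_{S_1}, I_{S_2}$ are diagonal matrices whose diagonal entries corresponding to $S_1, S_2$ are 1 and the rest are zero.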
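The truncation rule used to define the vectors $b_i$ can be sketched in a few lines. This is an illustrative sketch, not the authors' code: the names `shrink` and `build_b` and the toy vector `y` are assumptions of the example, and the support is passed as a boolean mask.

```python
def shrink(u, lam):
    """Real scalar rule applied off the support: 0 if |u| <= lam/4, else u - lam*sgn(u)/4."""
    if abs(u) <= lam / 4:
        return 0.0
    return u - lam * (1.0 if u > 0 else -1.0) / 4


def build_b(y, on_support, lam):
    """Apply `shrink` separately to real and imaginary parts; entries on the support are set to 0."""
    return [
        0j if on_support[j] else complex(shrink(y[j].real, lam), shrink(y[j].imag, lam))
        for j in range(len(y))
    ]


y = [1 + 0j, 0.1 - 0.9j, -0.3 + 0.2j, 0.05 + 0.05j]
on_support = [True, False, False, False]
lam = 1.0
b = build_b(y, on_support, lam)
# Largest magnitude of y - b over the complement of the support.
gap = max(abs(y[j] - b[j]) for j in range(len(y)) if not on_support[j])
```

Off the support, the real and imaginary parts of $y_i - b_i$ are each at most $\lambda_i/4$ in magnitude, so the modulus is at most $\sqrt{2}\lambda_i/4 < \lambda_i/2$; the value of `gap` in the example respects this bound.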

Lemma IV.1. Assume $x$, $\{y_i, b_i, c_i\}_{i=1}^{2}$ are the same as described previously. Then, one has the following:

$$S_1(y_1 + s) = S_1(\mathrm{sgn}(x))$$
$$S_2(y_2 - Ds) = \lambda S_2(\mathrm{sgn}(Dx))$$
$$\|\bar{S}_1(y_1 + s)\|_\infty \le \frac{1}{2} + \|\bar{S}_1(c_1)\|_\infty + \|\bar{S}_1(D^*b_2)\|_\infty$$
$$\|\bar{S}_2(y_2 - Ds)\|_\infty \le \frac{\lambda}{2} + \|\bar{S}_2(c_2)\|_\infty + \|\bar{S}_2(Db_1)\|_\infty$$

Based on Lemma IV.1 and Lemma II.2, JBP recovers $x$ if we have $\|\bar{S}_1(c_1)\|_\infty + \|\bar{S}_1(D^*b_2)\|_\infty < 1/2$ and $\|\bar{S}_2(c_2)\|_\infty + \|\bar{S}_2(Db_1)\|_\infty < \lambda/2$.

As a next step, we analyze $\|\bar{S}_1(D^*I_{S_2}Db_1)\|_\infty$ and $\|\bar{S}_1(D^*b_2)\|_\infty$ and find the conditions that guarantee their sum to be small. The analysis for $S_2$ is identical to that for $S_1$ and hence is omitted.
B. Probabilistic Analysis 

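Assume $A$ is i.i.d. complex normal with variance $\frac{1}{m}$ and $m > 64\max\{|S_1|, |S_2|\}$. This will guarantee,

$$\sigma_{\min}(A_{S_1}) \ge 1/\sqrt{2} \quad \text{and} \quad \sigma_{\min}(B_{S_2}) \ge 1/\sqrt{2} \tag{13}$$

with probability $1 - \exp(-\Omega(m))$ [11].

Now, conditioned on $A_{S_1}, B_{S_2}$ satisfying (13),

$$\|s_1\|_2^2 = s_1^*s_1 = S_1(\mathrm{sgn}(x))^*(A_{S_1}^*A_{S_1})^{-1}S_1(\mathrm{sgn}(x)) \le 2|S_1|$$

and, in the same way, $\|s_2\|_2^2 \le 2\lambda^2|S_2|$. Moreover, $\bar{S}_i(y_i)$ is an i.i.d. Gaussian vector whose entries have variance $\frac{\|s_i\|_2^2}{m}$. Given these, we need to understand when we can make sure,

$$\|\bar{S}_1(D^*I_{S_2}Db_1)\|_\infty < \frac{1}{4} \quad \text{and} \quad \|\bar{S}_1(D^*b_2)\|_\infty < \frac{1}{4}$$

From (11), observe that $\bar{S}_i(b_i)$ is a function of $\bar{S}_i(y_i)$, which is i.i.d. Gaussian. The next lemma gives a characterization of $b_i$.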

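Lemma IV.2. Assume $m > 64\max\{|S_1|, |S_2|\}$. Then, the entries $\{\bar{S}_i(b_i)_j\}_{j=1}^{n-|S_i|}$ of $\bar{S}_i(b_i)$ are i.i.d. random variables with the following distribution,

$$\bar{S}_i(b_i)_j \text{ is } \begin{cases} 0 & \text{with probability at least } 1 - 4\exp(-\frac{m}{16|S_i|}) \\ \text{distributed as } z & \text{otherwise} \end{cases} \tag{14}$$

where $z$ is zero mean and the subgaussian norms (see [11]) of $\mathcal{R}(z), \mathcal{I}(z)$ are upper bounded by $c_0\lambda_i\sqrt{|S_i|/m}$ for an absolute constant $c_0 > 0$.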

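1) Analysis of $\|\bar{S}_1(c_1)\|_\infty$: We need to show,

$$\|\bar{S}_1(D^*I_{S_2}Db_1)\|_\infty < \frac{1}{4} \tag{15}$$

Calling $C = D^*I_{S_2}D$, from Lemma VII.1 each row of $C$ has energy $\frac{|S_2|}{n}$. Let $c_i$ be the $i$'th column of $C^*$. Then, using Lemma IV.2 and Proposition 5.10 of [11], for any $i$ and an absolute constant $c > 0$ (into which numerical factors are absorbed),

$$P\left(|c_i^*b_1| > \frac{1}{4}\right) \le 12\exp\left(-\frac{mc}{c_0^2|S_1|\|c_i\|_2^2}\right) \tag{16}$$
$$= 12\exp\left(-\frac{mnc}{c_0^2|S_1||S_2|}\right) \tag{17}$$

Using a union bound over all $i$'s shows that (15) reduces to arguing $nP(|c_i^*b_1| > \frac{1}{4}) \to 0$, which is equivalent to ensuring,

$$\frac{mnc}{c_0^2|S_1||S_2|} - \log n \to \infty \quad \text{as} \quad n \to \infty \tag{18}$$

Using $n \ge \min\{|S_1|, |S_2|\}\log n$ from the statement of Theorem III.1, (18) holds for $m \ge Kc^{-1}c_0^2\max\{|S_1|, |S_2|\} = O(\max\{|S_1|, |S_2|\})$ with a sufficiently large numerical constant $K$, as desired.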



2) Analysis of $\|\bar{S}_1(D^*b_2)\|_\infty$: In a similar fashion, we would like to show that

$$\|\bar{S}_1(D^*b_2)\|_\infty < \frac{1}{4} \tag{19}$$

holds with high probability, to conclude. Each row of $D^*$ has unit $\ell_2$ norm and the nonzero entries of $b_2$ are i.i.d. subgaussians from Lemma IV.2. Letting $p = 4\exp(-\frac{m}{16|S_2|})$ and applying a Chernoff bound, with probability at least $1 - \exp(-np/4)$, the number of nonzeros in $b_2$ is at most $2np$. Considering the inner products between each row of $D^*$ and $b_2$, and using a union bound, (19) holds with probability at least,

$$1 - 12n\exp\left(-\frac{cm}{c_0^2\lambda^2|S_2|p}\right) - \exp(-np/4) \tag{20}$$

where numerical factors are absorbed into the absolute constant $c$. Assuming $m = O(|S_2|\log^{\alpha}(n))$ for some $\alpha < 1$, we have $\exp(-np/4) \to 0$. Finally, to show that the second term in (20) approaches 0, for some absolute constants $c_1, c_2 > 0$, we need to argue,

$$\frac{m}{c_1\lambda^2|S_2|}\exp\left(\frac{m}{c_2|S_2|}\right) - \log n \to \infty \quad \text{as} \quad n \to \infty \tag{21}$$

Following the same arguments for the other basis will yield,



$$\frac{\lambda^2m}{c_1|S_1|}\exp\left(\frac{m}{c_2|S_1|}\right) - \log n \to \infty \quad \text{as} \quad n \to \infty \tag{22}$$

By choosing $m = O(\max\{|S_1|, |S_2|\}\log\log n)$ and $\lambda = 1$ one can always satisfy these. In case $|S_2| > |S_1|\log\log n$, choose $\lambda = \log^{-1}(n)$ and $m$ sufficiently large but $O(|S_2|)$ to still satisfy both.

V. Empirical Results 

While Theorem III.1 shows that JBP can indeed outperform BP, it is important to understand how good it actually is. We considered the following basic setup: Let $k$ be a positive integer and $n = k^2$. Then, let $x \in \mathbb{R}^n$ be the following Dirac comb,

$$x_i = 1 \text{ if } i \equiv 1 \pmod{k} \text{ and } 0 \text{ else} \tag{23}$$

It is clear that $Dx = x$, hence the signal is only $\sqrt{n}$ sparse in both domains and the optimal weight in JBP is $\lambda = 1$ by symmetry. Simulation for JBP is performed for $k \in \{2, 4, 6, \dots, 32\}$ and for $1 \le m \le 30$. Interestingly, in order to achieve 50% success, JBP required $m$ on the order of $k$, and the ratio $\frac{m}{k}$ slightly increased as a function of $k$. This is shown as the solid line in Figure 1. These results are quite consistent with Theorem III.1, from which we expect to have $m = O(k\log\log k)$ measurements.
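The statement that this Dirac comb is its own DFT can be checked numerically. The sketch below assumes the unitary DFT normalization and 0-based indexing (so the comb sits on multiples of $k$); `dft` is an illustrative helper written for this example.

```python
import cmath
import math


def dft(v):
    """Unitary DFT of a list of numbers, computed directly from the definition."""
    n = len(v)
    w = cmath.exp(-2j * math.pi / n)
    return [sum(w ** (a * b) * v[b] for b in range(n)) / math.sqrt(n) for a in range(n)]


k = 4
n = k * k
# Dirac comb with k unit spikes, one every k samples (0-based version of (23)).
x = [1.0 if i % k == 0 else 0.0 for i in range(n)]
err = max(abs(a - b) for a, b in zip(dft(x), x))
nonzeros = sum(1 for a in x if a != 0)
```

Here `err` is at round-off level and the comb has $k = \sqrt{n}$ nonzero entries, which is the sense in which the test signal is equally sparse in both domains.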

On the other hand, the 50% success curve for BP is shown as the dashed line in Figure 1 and obeys $m = O(k\log k)$, as expected from classical results on $\ell_1$ minimization. In particular, $\frac{m}{k}$ increases from 1 to 2.4 as $k$ moves from 2 to 32. While JBP outperforms BP in this setting, the fact that it requires $\Omega(k)$ samples to recover a highly structured signal is disappointing. It would be interesting to see whether a greedy algorithm can be developed to attack this problem.

[Figure 1: phase transition plot. Horizontal axis: Sparsity (k). Legend: Joint Basis Pursuit; Regular Basis Pursuit.]

Fig. 1. Phase transitions of JBP vs BP where sparsity $k$ varies between 2 and 32 and $n = k^2$. The dark region indicates failure for JBP while the light region corresponds to success. Solid and dashed lines are the 50% success curves for JBP and BP respectively. While JBP outperforms BP, it still requires $\Omega(k)$ measurements.

VI. Extension to Matrices 

As discussed in the introduction, similar to jointly sparse signals one might as well consider matrices that are sparse and low rank. The motivation is the sparse phase retrieval problem, where $x$ is a sparse vector to be recovered from observations $\{|\langle a_i, x\rangle|^2\}_{i=1}^{m}$, where $\{a_i\}_{i=1}^{m} \in \mathbb{C}^n$ are the measurement vectors. Although these measurements are not linear in $x$, they are linear in $xx^*$, as $|\langle a_i, x\rangle|^2 = a_i^*xx^*a_i$. Using the fact that $xx^*$ is rank 1 and sparse, JBPM can be used in order to recover $X = xx^*$, as it will enforce a low-rank and sparse solution.
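The lifting identity behind this reformulation is easy to verify on a toy example. The sketch below is illustrative: the vectors `a` and `x` are arbitrary choices for the example and the function names are not from the paper.

```python
def inner(a, x):
    """Inner product <a, x> = a^* x."""
    return sum(ai.conjugate() * xi for ai, xi in zip(a, x))


def lifted_measurement(a, x):
    """Compute a^* X a for the rank-one matrix X = x x^*, which is linear in X."""
    n = len(x)
    big_x = [[x[i] * x[j].conjugate() for j in range(n)] for i in range(n)]
    return sum(a[i].conjugate() * big_x[i][j] * a[j] for i in range(n) for j in range(n))


a = [1 + 2j, -1j, 0.5 + 0j]
x = [0j, 2 - 1j, 0j]            # a 1-sparse vector, so x x^* has a single nonzero entry
direct = abs(inner(a, x)) ** 2  # quadratic in x
lifted = lifted_measurement(a, x)
```

Both quantities agree, and the lifted matrix inherits sparsity from $x$ while having rank one, which is the structure JBPM is meant to exploit.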

Although this work will not deal with the analysis of this problem, we point out that our framework for JBP can be used for JBPM as well. In general, assume the matrix $X$ is low-rank and sparse and we wish to recover it from observations $\mathcal{A}(X)$. Let us first introduce notation relevant to the structure of $X \in \mathbb{C}^{n\times n}$.

• Let $S \subseteq [n] \times [n]$ be the usual support of $X$ and $S(\cdot) : \mathbb{C}^{n\times n} \to \mathbb{C}^{|S|}$ be the projection onto $S$.

• Assuming $X$ has singular value decomposition $U\Sigma V^*$, define the subspace $\mathcal{L} \subseteq \mathbb{C}^{n\times n}$ as,

$$\mathcal{L} = \{Y \in \mathbb{C}^{n\times n} \mid (I - UU^*)Y(I - VV^*) = 0\}$$

• $\bar{\mathcal{L}}$ denotes the complement of $\mathcal{L}$ and the projection onto $\mathcal{L}$ is denoted by $\mathcal{L}(\cdot) : \mathbb{C}^{n\times n} \to \mathbb{C}^{n\times n}$.

• $\mathcal{A}^*(\cdot) : \mathbb{C}^{m} \to \mathbb{C}^{n\times n}$ denotes the adjoint operator.

• Operator norm is denoted by $\|\cdot\|$.

The following lemma is effectively the analogue of Lemma II.2 and characterizes a simple condition for $X$ to be the unique optimizer of JBPM.



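Lemma VI.1. Assume $s_1, s_2 \in \mathbb{C}^m$, $\mathbf{S} \in \mathbb{C}^{n\times n}$ satisfying the following conditions exist:

• $\mathcal{L}(\mathcal{A}^*(s_1) + \mathbf{S}) = UV^*$.

• $\|\bar{\mathcal{L}}(\mathcal{A}^*(s_1) + \mathbf{S})\| < 1$.

• $S(\mathcal{A}^*(s_2) - \mathbf{S}) = \lambda S(\mathrm{sgn}(X))$.

• $\|\bar{S}(\mathcal{A}^*(s_2) - \mathbf{S})\|_\infty < \lambda$.

• $\mathcal{A}(\cdot)$ is invertible over $\{Y \mid \mathcal{L}(Y) = S(Y) = Y\}$.

Then $X$ is the unique optimum of (JBPM).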

Finally, it would be interesting to see whether similar 
or better improvements can be shown for JBPM over reg- 
ular BP or regular nuclear norm minimization algorithms. 

VII. Appendix

We will start by proving Lemma III.1 using a classical argument.

Proof of Lemma III.1: Let us first fix $S_1, S_2$ and consider these particular supports. Let $C_i \in \mathbb{R}^{n\times|S_i|}$ be the matrix obtained by taking the columns of $U_i^{-1}$ over $S_i$. If $z_1 = U_1x$ and $z_2 = U_2x$ are supported over $S_1, S_2$, we may write:

$$0 = U_1^{-1}z_1 - U_2^{-1}z_2 = [C_1 \ C_2][S_1(z_1)^* \ \ {-S_2(z_2)^*}]^* \tag{24}$$

By assumption, $[C_1 \ C_2] \in \mathbb{R}^{n\times(|S_1| + |S_2|)}$ has i.i.d. entries from a continuous distribution and hence has full column rank with probability 1 whenever $|S_1| + |S_2| \le n$. It follows that the only $(z_1, z_2)$ satisfying (24) is $(0, 0)$. There are finitely many $S_1, S_2$ pairs satisfying $|S_1| + |S_2| \le n$, hence a union bound will still give that, with probability 1, there exists no nonzero vector $x$ having combined sparsities of $U_1x$ and $U_2x$ at most $n$. ■

The following lemma gives a simple but useful property of the DFT matrix.

Lemma VII.1. Let $n = n_1n_2$ and $S_1, S_2$ be $n_1$ and $n_2$ periodic supports. Let $D \in \mathbb{C}^{n\times n}$ be the DFT matrix as previously. Further, let $C = D^*I_{S_2}D$. Then,

1) $C_{i,j} = 0$ for any $i, j$ with $i \not\equiv j \pmod{n_1}$.

2) For any $i$, the $i$'th row of $C$ satisfies $\|c_i\|_2^2 = \frac{|S_2|}{n}$.

3) For any $x$ that is supported on $\bar{S}_1$, we have $S_1(Cx) = 0$.

4) The first three results similarly hold for $C = DI_{S_1}D^*$, with the roles of $S_1, n_1$ and $S_2, n_2$ exchanged.
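The first three properties can be checked numerically on a small instance before reading the proof. This is a sketch under assumptions: 0-based indices, a unitary DFT, $n = 12$ with $n_1 = 3$, $n_2 = 4$, and particular periodic supports chosen for the example; `dft_matrix` and `masked_gram` are hypothetical helper names.

```python
import cmath
import math


def dft_matrix(n):
    """Unitary DFT matrix with entries W**(i*j) / sqrt(n), W = exp(-2*pi*1j/n)."""
    w = cmath.exp(-2j * math.pi / n)
    return [[w ** (i * j) / math.sqrt(n) for j in range(n)] for i in range(n)]


def masked_gram(d, s2):
    """C = D^* I_{S2} D, i.e. C[i][j] = sum over t in S2 of conj(D[t][i]) * D[t][j]."""
    n = len(d)
    return [[sum(d[t][i].conjugate() * d[t][j] for t in s2) for j in range(n)] for i in range(n)]


n, n1, n2 = 12, 3, 4
s1 = [i for i in range(n) if i % n1 == 0]   # an n1-periodic support
s2 = [i for i in range(n) if i % n2 == 1]   # an n2-periodic support
c = masked_gram(dft_matrix(n), s2)

# Property 1: entries vanish when i and j differ modulo n1.
off_class = max(abs(c[i][j]) for i in range(n) for j in range(n) if (i - j) % n1 != 0)
# Property 2: every row has squared norm |S2| / n.
row_energy = [sum(abs(e) ** 2 for e in row) for row in c]
# Property 3: for x supported off S1, Cx vanishes on S1.
x = [0j if j in s1 else complex(j + 1, -j) for j in range(n)]
cx_on_s1 = max(abs(sum(c[i][j] * x[j] for j in range(n))) for i in s1)
```

All three quantities match the lemma up to round-off in this instance, which is consistent with $C$ being block structured along residue classes modulo $n_1$.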

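Proof: Let us start by analyzing the matrix $D^*I_{S_2}D$. Let $d_i$ be the $i$'th column of $D$. Then,

$$C_{i,j} = d_i^*I_{S_2}d_j = \sum_{t \in S_2}\bar{d}_{i,t}d_{j,t} \tag{25}$$

Using that $S_2$ is $n_2$ periodic, for some set $T \subseteq [n_2]$ (which is simply $S_2 \pmod{n_2}$), we may write,

$$C_{i,j} = \sum_{t \in T}\sum_{c=0}^{n_1-1}\bar{d}_{i,t+cn_2}d_{j,t+cn_2} \tag{26}$$

Next, for any $i, j$ and any $t \in T$,

$$\sum_{c=0}^{n_1-1}\bar{d}_{i,t+cn_2}d_{j,t+cn_2} = \frac{1}{n}W^{(t-1)(j-i)}\sum_{c=0}^{n_1-1}W^{cn_2(j-i)} = \frac{n_1}{n}W^{(t-1)(j-i)}\delta(i - j \pmod{n_1})$$

where $\delta(k) = 1 \iff k = 0$ and $\delta(k) = 0$ otherwise. This proves the first statement. To show the second, $c_i^* = d_i^*I_{S_2}D$ implies:

$$c_i^*c_i = d_i^*I_{S_2}DD^*I_{S_2}d_i = d_i^*I_{S_2}d_i = \frac{|S_2|}{n}$$

The third result will be a direct consequence of the first one: if $x$ is supported on $\bar{S}_1$, then

$$(Cx)_i = \sum_{j}C_{i,j}x_j = \sum_{j \in \bar{S}_1}C_{i,j}x_j \tag{27}$$

When $i \in S_1$, $j \in \bar{S}_1$, we have $i \not\equiv j \pmod{n_1}$ by definition, which implies $C_{i,j}x_j = 0$ due to the first result. The fourth result can be shown by repeating these arguments for $DI_{S_1}D^*$. ■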
Using Lemma VII.1, we now proceed with the proof of Lemma IV.1.

A. Proof of Lemma IV.1

Proof: The $S_1$ and $S_2$ components will be analyzed separately.

Analyzing $S_1$: We may start by considering $y_1 + s$ and write,

$$y_1 + s = y_1 + D^*(b_2 - c_2) - (b_1 - c_1) = y_1 + D^*b_2 - I_{S_1}D^*b_2 - b_1 + D^*I_{S_2}Db_1$$

First, we consider $S_1(y_1 + s)$. We have the following,

$$S_1(y_1) = S_1(\mathrm{sgn}(x)) \ \text{ by construction of } y_1 \tag{28}$$
$$S_1(D^*b_2 - I_{S_1}D^*b_2) = S_1((I - I_{S_1})D^*b_2) = 0 \tag{29}$$
$$S_1(b_1) = 0 \ \text{ by construction of } b_1 \tag{30}$$
$$S_1(D^*I_{S_2}Db_1) = 0 \ \text{ from Lemma VII.1} \tag{31}$$

Hence, we find $S_1(y_1 + s) = S_1(y_1) = S_1(\mathrm{sgn}(x))$. To upper bound $\|\bar{S}_1(y_1 + s)\|_\infty$, we may simply use $\|\bar{S}_1(y_1 - b_1)\|_\infty \le 1/2$ and write,

$$\|\bar{S}_1(y_1 + s)\|_\infty \le \|\bar{S}_1(y_1 - b_1)\|_\infty + \|\bar{S}_1(c_1)\|_\infty + \|\bar{S}_1(D^*b_2)\|_\infty$$

Analyzing $S_2$: Similarly, for $S_2(y_2 - Ds)$, we have the following,

$$S_2(y_2) = \lambda S_2(\mathrm{sgn}(Dx)) \ \text{ by construction} \tag{32}$$
$$S_2(Db_1 - I_{S_2}Db_1) = S_2((I - I_{S_2})Db_1) = 0 \tag{33}$$
$$S_2(b_2) = 0 \ \text{ by construction} \tag{34}$$
$$S_2(DI_{S_1}D^*b_2) = 0 \ \text{ from Lemma VII.1} \tag{35}$$

Hence, $S_2(y_2 - Ds) = \lambda S_2(\mathrm{sgn}(Dx))$ as desired.

To upper bound $\|\bar{S}_2(y_2 - Ds)\|_\infty$, we may use $\|\bar{S}_2(y_2 - b_2)\|_\infty \le \lambda/2$ and write,

$$\|\bar{S}_2(y_2 - Ds)\|_\infty \le \frac{\lambda}{2} + \|\bar{S}_2(c_2)\|_\infty + \|\bar{S}_2(Db_1)\|_\infty$$

■


B. Proof of Lemma IV.2

Proof: We start by stating a useful lemma on Gaussian variables [11].

Lemma VII.2. Let $g$ be a real standard normal random variable. Then, for any $t \ge 0$,

$$P(|g| > t) \le 2\exp(-t^2/2) \tag{36}$$
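The tail bound can be compared against the exact two-sided Gaussian tail, which is available through the complementary error function. The sketch below is a numerical illustration only; the grid of test points and the function names are choices made for this example.

```python
import math


def two_sided_tail(t):
    """Exact P(|g| > t) for a standard normal g, via the complementary error function."""
    return math.erfc(t / math.sqrt(2))


def tail_bound(t):
    """Right hand side of (36)."""
    return 2 * math.exp(-t * t / 2)


# Compare the exact tail and the bound on a grid t = 0.0, 0.1, ..., 6.0.
checks = [(t / 10, two_sided_tail(t / 10), tail_bound(t / 10)) for t in range(0, 61)]
bound_holds = all(exact <= bound for _, exact, bound in checks)
```

On this grid the exact tail never exceeds the bound, and at $t = 0$ the bound is loose by a factor of two.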



Our discussion will be for $S_1$ only. The proof for $S_2$ is identical.

Case 1: Estimating $P(\bar{S}_1(b_1)_j = 0)$

Observe that $A_{S_1}$ and $A_{\bar{S}_1}$ are independent matrices with i.i.d. Gaussian entries. Hence, for fixed $A_{S_1}$, $\bar{S}_1(y_1)$ is a vector with i.i.d. complex Gaussian entries with variance $\frac{\|s_1\|_2^2}{m}$. Next, from (11) it can be seen that $\bar{S}_1(b_1)$ is an entrywise function of $\bar{S}_1(y_1)$ and hence i.i.d. Using Lemma VII.2 and conditioned on $\sigma_{\min}(A_{S_1}) \ge 1/\sqrt{2}$, for any $i \in \bar{S}_1$,

$$P(\mathcal{R}(b_{1,i}) = 0) = P\left(|\mathcal{R}(y_{1,i})| \le \frac{1}{4}\right) \ge 1 - 2\exp\left(-\frac{m}{16|S_1|}\right) \tag{37}$$

which follows from the variance bound on $\mathcal{R}(y_{1,i})$. Using a union bound over the real and imaginary parts of $b_{1,i}$, we find,

$$P(b_{1,i} = 0) \ge 1 - 4\exp\left(-\frac{m}{16|S_1|}\right) \tag{38}$$



Case 2: Subgaussian norm when $\bar{S}_1(b_1)_j \neq 0$

Let us first define a subgaussian random variable and its norm.

Definition VII.1. Let $z \in \mathbb{R}$ be a scalar random variable. Assume for some $K < \infty$,

$$(E[|z|^p])^{1/p} \le K\sqrt{p} \quad \text{for all integers } p \ge 1 \tag{39}$$

Then, $z$ is a subgaussian random variable and the smallest $K$ satisfying (39) is the subgaussian norm of $z$.

Assume $i \in \bar{S}_1$. This time, we consider the case where $|b_{1,i}| > 0$. Clearly the real and imaginary components of $b_{1,i}$ are independent, as is the case for $y_{1,i}$. Without loss of generality consider the real part. Observe that if $\mathcal{R}(b_{1,i}) \neq 0$, then it is $\mathcal{R}(y_{1,i}) - \frac{1}{4}\mathrm{sgn}(\mathcal{R}(y_{1,i}))$, where $\mathrm{var}(\mathcal{R}(y_{1,i})) \le \frac{|S_1|}{m} < \frac{1}{64}$ by assumption. Hence, using the following lemma, we can conclude that the subgaussian norm of $\mathcal{R}(b_{1,i})$ is upper bounded by $c_0\sqrt{|S_1|/m}$, as $1/4 > \sqrt{2}/8$.

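Lemma VII.3. Let $c > \sqrt{2}$ be a scalar, $x$ be a standard normal random variable and,

$$z = x - c\cdot\mathrm{sgn}(x) \quad \text{conditioned on } |x| > c \tag{40}$$

Then, $z$ has subgaussian norm at most $c_0$ for some absolute constant $c_0$.

Proof: The following inequality is true for the tail $Q(\cdot)$ of the Gaussian p.d.f., for $x > 0$:

$$\frac{1}{\sqrt{2\pi}x}(1 - x^{-2})\exp(-x^2/2) \le Q(x) \le \frac{1}{\sqrt{2\pi}x}\exp(-x^2/2)$$

Hence, using $c > \sqrt{2}$, for $t \ge 0$ we have,

$$P(|z| > t) = \frac{Q(t + c)}{Q(c)} \le \frac{c\exp(-(t + c)^2/2)}{(t + c)(1 - c^{-2})\exp(-c^2/2)} \le 2\exp(-t^2/2)$$

The result immediately follows from Lemma 5.5 of [11] and from the bound on $P(|z| > t)$. ■

Finally, $b_{1,i}$ is zero mean, as $y_{1,i}$ is distributed symmetrically around 0 and the construction of $b_{1,i}$ preserves the symmetry. ■

C. Proposition 5.10 and sums of sub-gaussians

Next, we state Proposition 5.10 of [11] for completeness, which gives a bound on a weighted sum of subgaussians.

Theorem VII.1 (Proposition 5.10 of [11]). Let $z_1, \dots, z_n$ be independent zero-mean subgaussian random variables with subgaussian norms upper bounded by $c_0 > 0$. Let $a \in \mathbb{R}^n$ be an arbitrarily chosen vector. Then, for all $t \ge 0$,

$$P\left(\left|\sum_{i}a_iz_i\right| > t\right) \le 3\exp\left(-\frac{ct^2}{c_0^2\|a\|_2^2}\right) \tag{41}$$

where $c > 0$ is an absolute constant.

Based on this, we can obtain (16), as $\mathcal{R}(\bar{S}_1(b_1))$ is i.i.d. subgaussian with norm at most $c_0\sqrt{|S_1|/m}$ and we need to argue that the contributions from the real and imaginary parts are each sufficiently small with high probability. In particular, for the $j$'th row of $C$ and a threshold $t > 0$,

$$P\left(\left|\sum_{i}\mathcal{R}(c_{j,i})\mathcal{R}(b_{1,i})\right| > t\right) \le 3\exp\left(-\frac{ct^2m}{c_0^2|S_1|\|c_j\|_2^2}\right) \tag{42}$$

Writing similar bounds for $|\sum_i\mathcal{I}(c_{j,i})\mathcal{I}(b_{1,i})|$, $|\sum_i\mathcal{R}(c_{j,i})\mathcal{I}(b_{1,i})|$, $|\sum_i\mathcal{I}(c_{j,i})\mathcal{R}(b_{1,i})|$ and choosing $t$ as a fixed fraction of $\frac{1}{4}$, we can conclude (16). Similarly, to obtain (20), we again use bounds on the real and imaginary parts. This time we consider only the nonzero entries, which number at most $2np$ with high probability. Then, denoting the $j$'th row of $D^*$ by $r_j$, we can write,

$$P\left(\left|\sum_{i}\mathcal{R}(r_{j,i})\mathcal{R}(b_{2,i})\right| > t\right) \le 3\exp\left(-\frac{ct^2m}{2c_0^2p|S_2|\lambda^2}\right)$$

Doing this for all components and union bounding similarly yields (20).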



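D. Proof of Lemma VI.1

Finally, we give the proof of Lemma VI.1, which is quite similar to the proof of Lemma II.2.

Proof: Following the notation introduced for the matrix case, we need to show that if such $s_1, s_2, \mathbf{S}$ exist, then a certain null space condition will hold for $\mathcal{A}$ which will guarantee recovery. Let us state this condition based on the subgradients of the nuclear norm and the $\ell_1$ norm: if the following holds for all nonzero $W \in \mathcal{N}(\mathcal{A})$, then $X$ is the unique optimum of JBPM.

$$f(W) := \lambda[\mathcal{R}(\langle \mathrm{sgn}(X), S(W)\rangle) + \|\bar{S}(W)\|_1] \tag{43}$$
$$+ \mathcal{R}(\langle UV^*, W\rangle) + \|\bar{\mathcal{L}}(W)\|_* > 0 \tag{44}$$

Now, assume such $s_1, s_2, \mathbf{S}$ exist and consider $v_1, v_2$ where:

$$v_1 = \mathcal{A}^*(s_1) + \mathbf{S} \quad \text{and} \quad v_2 = \mathcal{A}^*(s_2) - \mathbf{S} \tag{45}$$

Observe that for any $W \in \mathcal{N}(\mathcal{A})$, we have $\langle v_1 + v_2, W\rangle = 0$. Now, using this:

$$0 = \mathcal{R}(\langle v_1 + v_2, W\rangle) \tag{46}$$
$$= \mathcal{R}(\lambda\langle \mathrm{sgn}(X), S(W)\rangle + \langle \bar{S}(\mathcal{A}^*(s_2) - \mathbf{S}), W\rangle) + \mathcal{R}(\langle UV^*, W\rangle + \langle \bar{\mathcal{L}}(\mathcal{A}^*(s_1) + \mathbf{S}), W\rangle)$$

To end the proof, using invertibility of $\mathcal{A}(\cdot)$ on $\mathcal{L} \cap S$, we can conclude $\bar{\mathcal{L}}(W) \neq 0$ or $\bar{S}(W) \neq 0$, hence:

$$\mathcal{R}(\langle \bar{\mathcal{L}}(\mathcal{A}^*(s_1) + \mathbf{S}), W\rangle) < \|\bar{\mathcal{L}}(W)\|_* \quad \text{or} \tag{47}$$
$$\mathcal{R}(\langle \bar{S}(\mathcal{A}^*(s_2) - \mathbf{S}), W\rangle) < \lambda\|\bar{S}(W)\|_1 \tag{48}$$

Overall, existence of $s_1, s_2, \mathbf{S}$ implies the desired null space condition, i.e., $f(W) > 0$ for all nonzero $W \in \mathcal{N}(\mathcal{A})$. ■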



